SUMMARY

An urn model of Diaconis and some generalizations are discussed. A conver- 
gence theorem is proved that implies for Diaconis' model that the empirical 
distribution of balls in the urn converges with probability one to the uniform 
distribution. 
1 Introduction

Diaconis has formulated the following simple urn model. 
EXAMPLE 1. Let $G$ be a finite group, generated by $g_1, \ldots, g_r$. Initially,
an urn contains $r$ balls, each labeled by one of the generating elements. At
times $n = r+1, r+2, \ldots$ two balls are drawn with replacement from the
urn. The labels on these balls are multiplied to form a new group element.
A ball, bearing this element as its label, is then added to the urn, increasing
the number of balls in the urn by one. Let $X_k$ be the label indicator with
respect to the $k$th ball (i.e., $X_k$ is a vector of length $|G|$, with a one placed
in the coordinate associated with the ball's label and zeros elsewhere). Let
$p_{g,n} = \sum_{k=1}^{n} 1\{X_{g,k} = 1\}/n$ denote the relative frequency of balls labeled $g$ when
the total number of balls in the urn is $n$. As an application of Theorem 2
below, we verify a conjecture of Diaconis, that

$$p_{g,n} \to 1/|G|, \qquad g \in G,$$

as $n \to \infty$ with probability one.
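The urn of Example 1 is easy to simulate. The following is a minimal sketch, assuming the cyclic group $Z_m$ under addition modulo $m$ as a stand-in for a general finite group; the function name `simulate_diaconis_urn` and its parameters are illustrative and do not come from the paper.

```python
import random


def simulate_diaconis_urn(m, generators, n_balls, seed=0):
    """Simulate the urn on the cyclic group Z_m (addition mod m).

    The urn starts with one ball per generator. At each step two balls
    are drawn with replacement, their labels are combined with the group
    operation, and a ball with the resulting label is added.
    Returns the relative frequency of each group element once the urn
    holds n_balls balls.
    """
    rng = random.Random(seed)
    urn = list(generators)
    counts = [0] * m
    for g in urn:
        counts[g] += 1
    while len(urn) < n_balls:
        a = rng.choice(urn)      # first draw (with replacement)
        b = rng.choice(urn)      # second draw (with replacement)
        new = (a + b) % m        # group operation in Z_m
        urn.append(new)
        counts[new] += 1
    return [c / len(urn) for c in counts]


if __name__ == "__main__":
    # Z_5 generated by the single element 1 (an assumed toy case).
    print(simulate_diaconis_urn(m=5, generators=[1], n_balls=20000))
```

For a single long run the printed frequencies should all be close to $1/|G| = 1/5$, in line with the conjecture; a simulation of this kind illustrates the statement but of course proves nothing.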

EXAMPLE 2. A special case of Example 1 occurs when the balls are numbered
either 0 or 1 and the group operation is addition modulo 2. Then
the fraction of 1's in the urn after $n$ draws converges to 1/2 with probability
one. As a variation of this special case one can draw $k > 2$ balls from the urn
with replacement and add a 0 or a 1 according as the number of 1's drawn is
even or odd. Again the fraction of balls numbered 1 converges to 1/2 with
probability one.
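The parity variation can be sketched in the same way. The code below is an illustration under stated assumptions: the urn is started with one ball of each kind (an all-1 urn with odd $k$ would only ever add 1's), and the name `simulate_parity_urn` is hypothetical.

```python
import random


def simulate_parity_urn(k, n_balls, start=(0, 1), seed=0):
    """Draw k balls with replacement, add a 1 if the number of 1's drawn
    is odd and a 0 otherwise. Returns the final fraction of 1's."""
    rng = random.Random(seed)
    ones = sum(start)          # number of balls labeled 1
    total = len(start)         # number of balls in the urn
    while total < n_balls:
        frac = ones / total
        # Each of the k draws is a 1 with probability ones/total.
        drawn_ones = sum(rng.random() < frac for _ in range(k))
        ones += drawn_ones % 2
        total += 1
    return ones / total


if __name__ == "__main__":
    print(simulate_parity_urn(k=3, n_balls=20000))
```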

EXAMPLE 3. For an example motivated by a classical model in population
genetics (e.g. Ewens (1969)), we suppose that the population size in a pure
birth process at the $n$th generation is $k_n > n$. The population consists of
three kinds of individuals corresponding to the three biallelic genotypes AA,
Aa, and aa, which have relative fitness (i.e., probability of reproduction)
of $1 - s$, $1$, $1 - t$, respectively. We assume $s < 1$, $t < 1$. In the most
interesting special case $0 < s < 1$, $0 < t < 1$, so the heterozygote Aa has
the greatest fitness. Let $p_n$ denote the fraction of A alleles in the population
at the $n$th generation. Then under random mating the relative proportions
of AA, Aa and aa genotypes that reproduce in the $(n+1)$st generation are
$p_n^2(1 - s) : 2p_n(1 - p_n) : (1 - p_n)^2(1 - t)$. We assume that reproduction occurs
independently of the population size process. Does the fraction $p_n$ converge
and what is its limit? In this example it is natural to assume that $k_n$ grows
exponentially, so that the number of balls added to the urn in each generation
is comparable to the number of balls already in the urn. One could also add
this feature to Examples 1 and 2.



2 Convergence to a fixed point 

Consider a finite set $G$. Let $G^*$ be the simplex of probability distributions
over $G$ and let $T : G^* \to G^*$ be a map of the simplex into itself. The
point $q \in G^*$ is a fixed point of the transformation if $T(q) = q$. Below we
investigate almost-sure convergence of the stochastic sequence of empirical
distributions $\{p_n\}$, defined by the recursion

$$p_{n+1} = \frac{1}{k_{n+1}} \Bigl( k_n p_n + \sum_{i=k_n+1}^{k_{n+1}} X_i \Bigr),$$

where $\{k_n\}$ is a monotone sequence of integer-valued random variables (i.e.,
$k_{n+1} \geq k_n + 1$ for all $n$), and $X_i$ is a random vector that indicates an element
of $G$. The integer $k_0$ is positive and $p_0$ is a given initial distribution vector.
Consider the filtration $\mathcal{F}_n = \sigma\{X_1, \ldots, X_{k_n}, k_1, \ldots, k_n, k_{n+1}\}$, for $n \geq 1$. We
assume that, conditional on $\mathcal{F}_n$,

$$\sum_{i=k_n+1}^{k_{n+1}} X_i \sim \mathrm{Multinomial}\bigl(T(p_n),\, k_{n+1} - k_n\bigr), \qquad (1)$$

and identify sufficient conditions to ensure the convergence of $p_n$ to a
contracting (cf. assumption A1 below) fixed point of the transformation $T$.
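The general scheme (batches of multinomial draws from $T(p_n)$ averaged into the empirical distribution) can be sketched as follows. This is an illustration only: the deterministic batch-size rule controlled by `growth`, the function name `run_recursion`, and the toy map `T_z2` (the two-draw urn on the group with two elements) are all assumptions made for the example.

```python
import random


def run_recursion(T, p0, k0, growth, n_steps, seed=0):
    """Iterate p_{n+1} = (k_n p_n + sum of new indicators) / k_{n+1}.

    T      : map from a probability vector (list) to a probability vector
    growth : assumed rule k_{n+1} = max(k_n + 1, floor(growth * k_n))
    Returns the final empirical distribution and the final urn size.
    """
    rng = random.Random(seed)
    d = len(p0)
    p, k = list(p0), k0
    for _ in range(n_steps):
        k_next = max(k + 1, int(growth * k))
        batch = k_next - k
        # Multinomial(T(p), batch) sampled as `batch` independent draws.
        draws = rng.choices(range(d), weights=T(p), k=batch)
        counts = [0] * d
        for i in draws:
            counts[i] += 1
        p = [(k * p[i] + counts[i]) / k_next for i in range(d)]
        k = k_next
    return p, k


def T_z2(p):
    """Two draws with replacement, labels added modulo 2."""
    return [p[0] ** 2 + p[1] ** 2, 2 * p[0] * p[1]]


if __name__ == "__main__":
    print(run_recursion(T_z2, [0.0, 1.0], k0=1, growth=1.2, n_steps=60))
```

Setting `growth=1.0` adds exactly one ball per step, which corresponds to the original urn; values above 1 give the exponentially growing batches mentioned in Example 3.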

Our argument is a two-fold application of the almost supermartingale 
convergence theorem of Robbins and Siegmund (1971). We begin with a 
statement of that theorem: 

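Theorem 1. Let $Z_n$, $\xi_n$, $\zeta_n$ be non-negative random variables adapted to the
increasing sequence of $\sigma$-algebras $\mathcal{F}_n$. Suppose that for each $n$,

$$E(Z_{n+1} \mid \mathcal{F}_n) \leq Z_n + \xi_n - \zeta_n.$$

Then $\lim Z_n$ exists and is finite and $\sum \zeta_n < \infty$ almost surely on the event
where $\sum \xi_n < \infty$.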

Our main result relies on the following assumptions on the transformation
$T$, the sequence $\{k_n\}$ and the initial distribution $p_0$:

A1: The collection $Q = \{q_0, q_1, \ldots, q_J\}$ of fixed points of $T$ is non-empty and
finite and the fixed point $q_0$ is contracting, i.e., $\|T(p) - q_0\| < \|p - q_0\|$,
for all $p \in G^* - Q$. The point $q_0$ may be in the interior of $G^*$, but all
other fixed points are on the boundary (i.e. their supports are proper
subsets of $G$).

A2: For all $j > 0$ let $c_j$ be a vector with 0's in those coordinates where $q_j$
has positive mass and 1's in those coordinates where $q_j$ has no mass.
Assume $c_j$ is not equal to the zero vector (which is equivalent to assuming
that $q_j$ is on the boundary of $G^*$). Further assume that $\langle c_j, p_0 \rangle > 0$
and for $p$ not orthogonal to $c_j$, $\liminf_{p \to q_j} \langle c_j, T(p) \rangle / \langle c_j, p \rangle > 1$.

A3: The increasing sequence, $k_n$, of random integers satisfies $k_{n+1}/k_n \leq C$,
for all $n$ and for some constant $C > 1$ such that $C - 1 < \min\{\| \cdots - q_j \| : \cdots\}$.

Theorem 2. Under the above assumptions, $p_n \to q_0$ with probability one as
$n \to \infty$.

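PROOF. The proof consists of applications of Theorem 1 to (a) $Z_n = \|p_n - q_0\|^2$
and (b) $Z_n = 1/\langle c_j, p_n \rangle$. Consider first case (a). Let
$\pi_{n+1} = (k_{n+1} - k_n)/k_{n+1}$ and define
$\bar{X}_{n+1} = \sum_{i=k_n+1}^{k_{n+1}} X_i / (k_{n+1} - k_n)$. Observe that
$p_{n+1} - q_0 = (1 - \pi_{n+1})(p_n - q_0) + \pi_{n+1}(\bar{X}_{n+1} - q_0)$. We take the conditional
expectation given $\mathcal{F}_n$ of the squared norm of this identity and use the facts
that (i) $E(\bar{X}_{n+1} \mid \mathcal{F}_n) = T(p_n)$ and (ii) the (conditional) second moment of a
random variable is the sum of its variance and the square of its expectation.
Then by regrouping terms and using the Cauchy-Schwarz inequality and
conditions A1, A3 we see that

$$
\begin{aligned}
E(Z_{n+1} \mid \mathcal{F}_n) &= Z_n - 2\pi_{n+1}(1 - \pi_{n+1}) \bigl[ Z_n - \langle p_n - q_0, T(p_n) - q_0 \rangle \bigr] \\
&\quad + \pi_{n+1}^2 \bigl[ E(\|\bar{X}_{n+1} - T(p_n)\|^2 \mid \mathcal{F}_n) + \|T(p_n) - q_0\|^2 - Z_n \bigr] \\
&\leq Z_n - 2\pi_{n+1}(1 - \pi_{n+1}) Z_n \Bigl( 1 - \frac{\|T(p_n) - q_0\|}{\|p_n - q_0\|} \Bigr) + \frac{k_{n+1} - k_n}{k_{n+1}^2}.
\end{aligned}
$$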
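The exact expansion of $E(Z_{n+1} \mid \mathcal{F}_n)$ for $Z_n = \|p_n - q_0\|^2$, namely $Z_n - 2\pi(1-\pi)[Z_n - \langle p_n - q_0, T(p_n) - q_0\rangle] + \pi^2[E\|\bar{X} - T(p_n)\|^2 + \|T(p_n) - q_0\|^2 - Z_n]$ with $\pi = (k_{n+1} - k_n)/k_{n+1}$, can be checked numerically for small batches by enumerating every possible batch of draws. The sketch below does this by brute force; the function name `check_identity` and the sample inputs are illustrative assumptions.

```python
from itertools import product


def check_identity(p, Tp, q0, k_n, batch):
    """Compare E||p_{n+1} - q0||^2 (by exhaustive enumeration of the batch)
    with the closed-form expansion. Returns (lhs, rhs)."""
    d = len(p)
    k_next = k_n + batch
    pi = batch / k_next

    def sq(v):
        return sum(x * x for x in v)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    lhs = 0.0   # exact conditional expectation of Z_{n+1}
    var = 0.0   # exact E||Xbar - T(p)||^2
    for seq in product(range(d), repeat=batch):
        prob = 1.0
        for i in seq:
            prob *= Tp[i]                       # i.i.d. draws from T(p)
        xbar = [seq.count(i) / batch for i in range(d)]
        p_next = [(1 - pi) * p[i] + pi * xbar[i] for i in range(d)]
        lhs += prob * sq([p_next[i] - q0[i] for i in range(d)])
        var += prob * sq([xbar[i] - Tp[i] for i in range(d)])

    a = [p[i] - q0[i] for i in range(d)]        # p_n - q0
    b = [Tp[i] - q0[i] for i in range(d)]       # T(p_n) - q0
    z = sq(a)
    rhs = z - 2 * pi * (1 - pi) * (z - dot(a, b)) + pi ** 2 * (var + sq(b) - z)
    return lhs, rhs


if __name__ == "__main__":
    print(check_identity([0.5, 0.3, 0.2], [0.4, 0.35, 0.25],
                         [1 / 3, 1 / 3, 1 / 3], k_n=7, batch=3))
```

The identity is purely algebraic, so it holds for any probability vector supplied as `Tp`, whether or not it comes from a contracting map.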

Hence by A1 and Theorem 1, since

$$\sum_{n=0}^{\infty} \frac{k_{n+1} - k_n}{k_{n+1}^2} \leq \int_{k_0}^{\infty} \frac{dx}{x^2} < \infty,$$
we see that with probability one, $\lim Z_n$ exists and is finite and the negative
terms of the process are summable. By the non-negativity of the terms
involved and by the fact that

$$\sum_{n=0}^{\infty} \frac{k_{n+1} - k_n}{k_n} \geq \int_{k_0}^{\infty} \frac{dx}{x} = \infty,$$

we can conclude that either $Z_n \to 0$ or $\|T(p_n) - q_0\| / \|p_n - q_0\| \to 1$ as $n \to \infty$.
However, only fixed points produce equality in the contraction inequality.
Consequently by A3, with probability one $p_n$ converges to some $q_j \in Q$, the
set of fixed points.

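To eliminate the possibility that some $q_j$ with $j > 0$ is the limit, we
consider case (b): $Z_n = 1/\langle c_j, p_n \rangle$. Indeed, we let $A_j = \{p_n \to q_j\}$ and show
that $Z_n$ converges to a finite limit on $A_j$, which would be a contradiction
unless $P(A_j) = 0$. This will complete the proof of the theorem since $p_n$ must
converge to a fixed point.

We turn to proving the convergence of $\{Z_n\}$ on $A_j$. Define
$S_{n+1} = \langle c_j, \sum_{i=k_n+1}^{k_{n+1}} X_i \rangle$, $\tilde{p}_n = \langle c_j, p_n \rangle$ and
$\tilde{T}(p_n) = \langle c_j, T(p_n) \rangle$. Note that
$\tilde{p}_{n+1} = [k_n \tilde{p}_n + S_{n+1}] / k_{n+1}$. Conditional on $\mathcal{F}_n$, $S_{n+1}$ is the sum of a
subset of the coordinates of a multinomial vector and hence is distributed as
Binomial$(k_{n+1} - k_n, \tilde{T}(p_n))$. Now

$$E[Z_{n+1} \mid \mathcal{F}_n] = E\Bigl[ \frac{k_{n+1}}{k_n \tilde{p}_n + S_{n+1}} \Bigm| \mathcal{F}_n \Bigr] = \sum_{s=0}^{k_{n+1} - k_n} \frac{k_{n+1}}{k_n \tilde{p}_n + s} \, P(S_{n+1} = s \mid \mathcal{F}_n).$$

The relations $P(S_{n+1} = 0 \mid \mathcal{F}_n) = 1 - \sum_{s=1}^{k_{n+1} - k_n} P(S_{n+1} = s \mid \mathcal{F}_n)$ and
$1/(k_n \tilde{p}_n + s) - 1/(k_n \tilde{p}_n) = -s/(k_n \tilde{p}_n + s) \cdot 1/(k_n \tilde{p}_n)$ produce

$$E[Z_{n+1} \mid \mathcal{F}_n] = Z_n + \frac{k_{n+1} - k_n}{k_n \tilde{p}_n} \Bigl[ 1 - \frac{k_{n+1}}{k_{n+1} - k_n} \sum_{s=1}^{k_{n+1} - k_n} \frac{s \, P(S_{n+1} = s \mid \mathcal{F}_n)}{k_n \tilde{p}_n + s} \Bigr]. \qquad (2)$$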
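Identity (2) rewrites the conditional expectation of $Z_{n+1} = 1/\langle c_j, p_{n+1} \rangle$ as $Z_n$ plus a multiple of a bracketed term. A minimal numerical check with an explicit binomial distribution is sketched below; the argument names `rho` (for $\langle c_j, p_n \rangle$) and `tau` (for $\langle c_j, T(p_n) \rangle$) and the function name `both_sides` are assumptions chosen for the example.

```python
from math import comb


def both_sides(k_n, k_next, rho, tau):
    """Return (direct expectation, decomposed form, bracket term) for
    S ~ Binomial(k_next - k_n, tau) and Z_{n+1} = k_next / (k_n*rho + S)."""
    m = k_next - k_n
    pmf = [comb(m, s) * tau ** s * (1 - tau) ** (m - s) for s in range(m + 1)]
    a = k_n * rho
    # Direct computation of E[Z_{n+1} | F_n].
    direct = sum(k_next / (a + s) * pmf[s] for s in range(m + 1))
    # Bracket term: 1 - k_next/(k_next - k_n) * E[S / (k_n*rho + S)].
    bracket = 1 - (k_next / m) * sum(s * pmf[s] / (a + s) for s in range(1, m + 1))
    decomposed = 1 / rho + (m / a) * bracket
    return direct, decomposed, bracket


if __name__ == "__main__":
    print(both_sides(k_n=100, k_next=110, rho=0.05, tau=0.1))
```

In the sample call `tau / rho = 2`, mimicking the situation of assumption A2 in which the mass on the complement of the support grows under $T$; the bracket term is then negative, so $Z_n$ behaves like a supermartingale at that step.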
We will proceed by showing that on the event $\{\tilde{p}_n \to 0\} \supset A_j$ the term in the
square brackets is eventually strictly negative. Therefore, the positive part
is summable, and Theorem 1 can be used in order to conclude that $\lim Z_n$
exists and is finite.

We analyze separately the cases: (i) $E(S_{n+1} \mid \mathcal{F}_n) < \epsilon$, (ii) $\epsilon \leq E(S_{n+1} \mid \mathcal{F}_n) \leq M$,
and (iii) $E(S_{n+1} \mid \mathcal{F}_n) > M$, for some prespecified $0 < \epsilon < M < \infty$ to be
determined later.

Consider case (i). By the monotonicity of the function $x/(a + x)$ we obtain
the inequality

$$[\,\cdots] \leq 1 - \frac{k_{n+1}}{k_{n+1} - k_n} \cdot \frac{P(S_{n+1} \geq 1 \mid \mathcal{F}_n)}{k_n \tilde{p}_n + 1}.$$

Now, $P(S_{n+1} \geq 1 \mid \mathcal{F}_n) = 1 - (1 - \tilde{T}(p_n))^{k_{n+1} - k_n} \geq (k_{n+1} - k_n) \tilde{T}(p_n)(1 - \epsilon/2)$,
which leads to the inequality

$$[\,\cdots] \leq 1 - (1 - \epsilon/2) \, \frac{k_{n+1} \tilde{T}(p_n)}{k_n \tilde{p}_n + 1}.$$

If $k_n \tilde{p}_n \to \infty$, then assumption A2 will produce a negative limit provided
that $\epsilon$ is small enough.
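The elementary bound used in case (i), that a binomial count with $m$ trials and success probability $x$ satisfies $P(S \geq 1) = 1 - (1 - x)^m \geq m x (1 - \epsilon/2)$ whenever the mean $m x$ is below $\epsilon$, follows from $1 - (1-x)^m \geq mx - \binom{m}{2} x^2$. A small numerical sketch, with illustrative function names:

```python
def prob_at_least_one(m, x):
    """P(S >= 1) for S ~ Binomial(m, x)."""
    return 1.0 - (1.0 - x) ** m


def lower_bound(m, x, eps):
    """The bound m * x * (1 - eps / 2) used when m * x < eps."""
    return m * x * (1.0 - eps / 2.0)


if __name__ == "__main__":
    eps = 0.1
    for m in (1, 5, 50, 500):
        x = 0.9 * eps / m          # keeps the mean m * x below eps
        print(m, prob_at_least_one(m, x), lower_bound(m, x, eps))
```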

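To prove that $k_n \tilde{p}_n \to \infty$, it is sufficient to prove that $\sum_{n=0}^{\infty} 1\{S_{n+1} \geq 1\}$ is
almost surely infinite. Equivalently, it is enough to show

$$\sum_{n=n_0}^{\infty} P(S_{n+1} \geq 1 \mid \mathcal{F}_n) \geq (1 - \epsilon/2) \sum_{n=n_0}^{\infty} (k_{n+1} - k_n) \tilde{T}(p_n) = \infty$$

for an appropriate $n_0$. However, $\tilde{p}_n \geq \langle c_j, p_0 \rangle / k_n$, and the statement follows
from the fact that $\{(k_{n+1} - k_n)/k_n\}$ has an infinite sum.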

Next consider case (ii). Since $\tilde{T}(p_n) \to 0$, we must have that $k_{n+1} - k_n \to \infty$
and thus $S_{n+1}$ behaves in distribution like a Poisson random variable
(conditional on $\mathcal{F}_n$). This time we use the inequality

$$[\,\cdots] \leq 1 - \frac{1}{(k_{n+1} - k_n) \tilde{p}_n} \, E\Bigl( \frac{S_{n+1}}{1 + S_{n+1}/(k_n \tilde{p}_n)} \Bigm| \mathcal{F}_n \Bigr).$$

Case (ii) implies a lower bound on the term $(k_{n+1} - k_n) \tilde{p}_n$ and a stochastic
upper bound on the random variable $S_{n+1}$. It follows that the conditional
expectation is approximately $E(S_{n+1} \mid \mathcal{F}_n) = (k_{n+1} - k_n) \tilde{T}(p_n)$, which produces a negative
value in the square brackets, by A2.

Finally, consider case (iii). By monotonicity one gets that

$$\frac{s}{a + s} \geq \frac{y \cdot 1\{s \geq y\}}{a + y}$$

and, upon selecting $y = (1 - \epsilon_1) E(S_{n+1} \mid \mathcal{F}_n)$, the inequality

$$[\,\cdots] \leq 1 - \frac{k_{n+1} \, P\bigl(S_{n+1} \geq (1 - \epsilon_1) E(S_{n+1} \mid \mathcal{F}_n) \bigm| \mathcal{F}_n\bigr)}{k_n [\tilde{p}_n / ((1 - \epsilon_1) \tilde{T}(p_n))] + (k_{n+1} - k_n)}.$$

Chernoff's inequality leads to the upper bound

$$[\,\cdots] \leq 1 - \frac{k_{n+1} \bigl(1 - e^{-\epsilon_1^2 M / 2}\bigr)}{k_n [\tilde{p}_n / ((1 - \epsilon_1) \tilde{T}(p_n))] + (k_{n+1} - k_n)}.$$

Selection of a large enough $M$ and a small enough $\epsilon_1$ will lead to a negative
limit, provided that $(k_{n+1} - k_n)/k_n$ is bounded. This last condition is assured
by assumption A3.



3 Applications 



EXAMPLE 1. In the urn model of Diaconis the transformation takes the
form:

$$(T(p))_g = \sum_{h \in G} p_{gh^{-1}} \, p_h, \qquad \text{for } g \in G.$$

Any uniform distribution over a subgroup is a fixed point of this transformation.
Conversely, any fixed point is a uniform distribution over a subgroup.
The last statement follows from the fact that the support of a fixed point is
a subgroup since the support is closed under group operations and the group
is finite. Moreover, by the definition of a fixed point, the probability of each
element in the support must be equal to the maximum of all probabilities,
since otherwise a contradiction would occur. The collection of uniform distributions
over subgroups is finite.

Denote by $q_0$ the uniform distribution over the entire group. Viewing
$\{\sum_{h \in G} p_{gh^{-1}} p_h\}^2$ as the square of the expectation of the random variable
taking on the value $p_h$ with probability $p_{gh^{-1}}$, we obtain from the Cauchy-Schwarz
inequality that $\sum_{g \in G} \bigl( \sum_{h \in G} p_{gh^{-1}} p_h \bigr)^2 \leq \sum_{g \in G} p_g^2$, with strict
inequality unless $p_h$ is constant on its support. From this and direct computations,
we see that $T$ is contracting, so condition A1 is met.
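The contraction toward the uniform distribution can be observed numerically. The sketch below, given only as an illustration, builds the multiplication table of the symmetric group $S_3$ (an assumed example of a non-abelian group), applies the convolution map $T$, and compares Euclidean distances to the uniform distribution; all function names are hypothetical.

```python
from itertools import permutations


def cayley_table_s3():
    """Multiplication table of S_3: table[a][b] is the index of a composed with b."""
    elems = list(permutations(range(3)))
    index = {e: i for i, e in enumerate(elems)}
    return [[index[tuple(a[b[i]] for i in range(3))] for b in elems]
            for a in elems]


def convolve(p, table):
    """(T(p))_g = sum over pairs (a, b) with a*b = g of p_a * p_b."""
    out = [0.0] * len(p)
    for a, pa in enumerate(p):
        for b, pb in enumerate(p):
            out[table[a][b]] += pa * pb
    return out


def dist_to_uniform(p):
    u = 1.0 / len(p)
    return sum((x - u) ** 2 for x in p) ** 0.5


if __name__ == "__main__":
    table = cayley_table_s3()
    p = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
    for _ in range(4):
        print(round(dist_to_uniform(p), 6))
        p = convolve(p, table)
```

Because $\|p - q_0\|^2 = \sum_g p_g^2 - 1/|G|$ for any probability vector $p$, the Cauchy-Schwarz bound on $\sum_g (T(p))_g^2$ translates directly into a decrease of the distance to $q_0$.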

Let $G_j$ be a proper subgroup of $G$. Observe that $\langle c_j, p \rangle$ assigns a probability
to $G \setminus G_j$. A product of two group elements, one belonging to $G_j$ and
the other not belonging, produces a group element not belonging to $G_j$. It
follows that

$$\langle c_j, T(p) \rangle \geq 2 \langle c_j, p \rangle (1 - \langle c_j, p \rangle).$$

If $p_0$ assigns positive probabilities to generators of $G$ then $\langle c_j, p_0 \rangle > 0$ and
condition A2 is fulfilled.
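The inequality bounding the mass that $T(p)$ places outside a proper subgroup can be checked on a small example. The sketch below assumes the cyclic group $Z_6$ with the subgroups $\{0, 3\}$ and $\{0, 2, 4\}$; function names are illustrative.

```python
import random


def convolve_mod(p):
    """Convolution map T on the cyclic group Z_m, m = len(p)."""
    m = len(p)
    out = [0.0] * m
    for a in range(m):
        for b in range(m):
            out[(a + b) % m] += p[a] * p[b]
    return out


def mass_outside(p, subgroup):
    """Total probability assigned to elements not in the subgroup."""
    return sum(x for g, x in enumerate(p) if g not in subgroup)


def random_distribution(m, rng):
    w = [rng.random() for _ in range(m)]
    total = sum(w)
    return [x / total for x in w]


if __name__ == "__main__":
    rng = random.Random(0)
    p = random_distribution(6, rng)
    for sub in ({0, 3}, {0, 2, 4}):
        x = mass_outside(p, sub)
        print(sorted(sub), mass_outside(convolve_mod(p), sub), 2 * x * (1 - x))
```

When $x = \langle c_j, p \rangle$ is small the lower bound $2x(1 - x)$ is close to $2x$, which is how the ratio condition in A2 arises.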

EXAMPLE 2. From the elementary fact that when a coin is tossed $k$ times,
the probability of an odd number of heads is $[1 - (1 - 2p)^k]/2$, one can verify
the conditions of the theorem, to show that $p_n \to 1/2$ with probability one.
It is perhaps interesting to note that when $k$ is even the transformation $T(p)$
is concave; when $k$ is odd, it is concave to the left of 1/2 and convex to the
right of 1/2.
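The closed form for the probability of an odd number of heads can be compared with a direct binomial sum. A minimal sketch, with illustrative function names:

```python
from math import comb


def prob_odd_heads(k, p):
    """Sum the binomial probabilities over odd numbers of heads."""
    return sum(comb(k, j) * p ** j * (1 - p) ** (k - j)
               for j in range(1, k + 1, 2))


def closed_form(k, p):
    """The formula [1 - (1 - 2p)^k] / 2."""
    return (1 - (1 - 2 * p) ** k) / 2


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        print(k, prob_odd_heads(k, 0.3), closed_form(k, 0.3))
```

The formula also shows why $p = 1$ is a fixed point when $k$ is odd but not when $k$ is even, since $(1 - 2p)^k$ equals $-1$ or $+1$ there.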

EXAMPLE 3. From the assumption of random mating it follows that
$T(p) = p(1 - ps)/[1 - p^2 s - (1 - p)^2 t]$, from which it easily follows that 0 and 1
are fixed points of $T$. If $s$ and $t$ are both positive or both negative, then
$q^* = t/(s + t)$ is also a fixed point; otherwise 0 and 1 are the only fixed
points. It is straightforward to show that when $s$ and $t$ are both positive,
the interior point $t/(s + t)$ is attracting, so $p_n \to t/(s + t)$ with probability
one. (Like Example 2, $T$ is concave to the left of $q^*$ and convex to the right.)
When $s$ is nonpositive and $t$ is positive, the fixed point at 1 is attracting,
and conversely in the case when $s$ is positive and $t$ nonpositive. If $s = t = 0$,
every point in $[0, 1]$ is a fixed point and the sequence $p_n$ is a martingale, which
converges with probability one to a random limit. In the case when both $s$
and $t$ are negative, the fixed point at $t/(s + t)$ is not attracting. It seems
intuitively clear that $p_n$ must converge to 0 or 1, but this does not seem to
follow from Theorem 2 without an additional argument.
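The fixed points of the selection map can be checked directly. The following sketch evaluates $T$ and iterates it deterministically from an interior starting point; this is the noiseless map only, not the stochastic urn, and the parameter values $s = 0.2$, $t = 0.3$ are assumptions chosen for illustration.

```python
def selection_map(p, s, t):
    """T(p) = p(1 - p s) / [1 - p^2 s - (1 - p)^2 t]."""
    return p * (1 - p * s) / (1 - p * p * s - (1 - p) ** 2 * t)


def iterate(p, s, t, n_steps):
    """Apply the deterministic map n_steps times."""
    for _ in range(n_steps):
        p = selection_map(p, s, t)
    return p


if __name__ == "__main__":
    s, t = 0.2, 0.3
    q_star = t / (s + t)
    print(selection_map(q_star, s, t), q_star)   # interior fixed point
    print(iterate(0.3, s, t, 500))               # approaches q_star
```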